C3: A Parallel Model for Coarse-Grained Machines

Authors

  • Susanne E. Hambrusch
  • Ashfaq A. Khokhar
Abstract

In this paper, we propose a model for parallel computation, the C3-model. The C3-model evaluates, for a given parallel algorithm and target architecture, the complexity of computation, the pattern of communication, and the potential congestion arising during communication. A metric for estimating the effect of link and processor congestion on the performance of a communication operation is developed. This metric allows the evaluation of arbitrary communication operations without the user having to specify fine scheduling details. We describe how the C3-model can serve as a platform for the development of coarse-grained algorithms sensitive to the parameters of a parallel machine. The initial validation of the C3-model is discussed for the Intel Touchstone Delta. We compare predicted and actual performance of different solutions for communication operations and of various divide-and-conquer approaches for contour ranking on images.

Similar resources

C3: an architecture-independent model for coarse-grained parallel machines

We propose an architecture-independent parallel model, the C3-model. The C3-model evaluates, for a given parallel algorithm and target architecture, the complexity of computation, the pattern of communication, and the potential congestion arising in communication operations. A metric for estimating the effect of link and processor congestion on the performance of an arbitrary communication op...

Coarse grained parallel algorithms for graph matching

Parallel graph algorithm design is a very well studied topic. Many results have been presented for the PRAM model. However, these algorithms are inherently fine grained, and experiments show that PRAM algorithms often do not achieve the expected speedup on real machines because of large message overheads. In this paper, we present coarse grained parallel graph algorithms with small message overh...

PACK/UNPACK on Coarse-Grained Distributed Memory Parallel Machines

PACK/UNPACK are Fortran 90/HPF array construction functions which derive new arrays from existing arrays. We present algorithms for performing these operations on coarse-grained parallel machines. Our algorithms are relatively architecture independent and can be applied to arrays of arbitrary dimensions with arbitrary distribution along every dimension. Experimental results are presented on the

Communication-Efficient Deterministic Parallel Algorithms for Planar Point Location and 2d Voronoi Diagram

In this paper we describe deterministic parallel algorithms for planar point location and for building the Voronoi Diagram of n co-planar points. These algorithms are designed for BSP-like models of computation, where p processors, with O(n/p) ≫ O(1) local memory each, communicate through some arbitrary interconnection network. They are communication-efficient since they require, respectively, O...

Vector Prefix and Reduction Computation on Coarse-Grained, Distributed-Memory Parallel Machines

Vector prefix and reduction are collective communication primitives in which all processors must cooperate. We present two parallel algorithms, the direct algorithm and the split algorithm, for vector prefix and reduction computation on coarse-grained, distributed-memory parallel machines. Our algorithms are relatively architecture independent and can be used effectively in many applications su...

Journal title:
  • J. Parallel Distrib. Comput.

Volume: 32  Issue:

Pages:  -

Publication date: 1996